Improve `CommonSubexprEliminate` identifier management (10% faster planning) #10473
This is looking very exciting @peter-toth
datafusion/common/src/tree_node.rs
@@ -204,6 +214,24 @@ pub trait TreeNode: Sized {
        apply_impl(self, &mut f)
    }

    fn apply_ref<'n, F: FnMut(&'n Self) -> Result<TreeNodeRecursion>>(
This API would be helpful in other areas to avoid cloning -- I am very much in favor of adding it
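To illustrate why a traversal that hands out `&'n Self` avoids cloning, here is a minimal sketch on a toy tree. The `Node` type and `leaf_names` helper are hypothetical and are not DataFusion's `Expr` or `TreeNode`; the real `apply_ref` in the diff also returns `Result<TreeNodeRecursion>`, which is omitted here for brevity.

```rust
/// Hypothetical toy tree used only for illustration (not DataFusion's `Expr`).
enum Node {
    Leaf(String),
    Add(Box<Node>, Box<Node>),
}

impl Node {
    /// Visits every node top-down. The closure receives references that live
    /// as long as the tree itself (`'n`), so it may store them.
    fn apply_ref<'n, F: FnMut(&'n Node)>(&'n self, f: &mut F) {
        f(self);
        if let Node::Add(left, right) = self {
            left.apply_ref(f);
            right.apply_ref(f);
        }
    }
}

/// Collects borrowed leaf names; no `String` is cloned along the way.
fn leaf_names(root: &Node) -> Vec<&str> {
    let mut names = Vec::new();
    root.apply_ref(&mut |node| {
        if let Node::Leaf(name) = node {
            names.push(name.as_str());
        }
    });
    names
}
```

Because the references carry the tree's lifetime rather than the lifetime of a single closure call, the collected `&str` values can outlive the traversal, which is what lets callers keep pointers to nodes instead of copies.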
I think @peter-toth plans to break this PR up into smaller ones, so marking it as a draft to make it clear it isn't waiting on more feedback. If I am mistaken, please let me know
Yes, here is the first part that adds the new
@alamb, can you please help me with that MSRV failure? I don't know what could be the source of the failure and if I try running
assert_eq!(window_exprs.len(), arrays_per_window.len());
let num_window_exprs = window_exprs.len();
let rewritten_window_exprs = self.rewrite_expr(
    window_exprs.clone(),
This extra clone might look bad at first, but it is needed because `Identifier`s hold references to the original expressions, so we have to keep the original expressions intact.
- It is not worse than before because previously the `Identifier`s were stringified expression trees (so they were effectively clones of the original expressions).
- It can be better than before because we introduce the `found_common` flag above, which allows skipping the clone here when it makes no sense to trigger the second, rewriting traversal.
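The trade-off described in this comment can be sketched roughly as follows. This is a simplified illustration and not the PR's code: expressions are plain `String`s, and `find_common`, `eliminate` and the `__common(...)` marker are hypothetical names. The first pass only borrows the expressions and reports whether anything repeats; the clone is paid only when the rewriting pass actually runs.

```rust
use std::collections::HashMap;

/// First pass: counts occurrences by reference and reports whether any
/// expression appears more than once.
fn find_common<'n>(exprs: &'n [String], stats: &mut HashMap<&'n str, usize>) -> bool {
    let mut found_common = false;
    for expr in exprs {
        let count = stats.entry(expr.as_str()).or_insert(0);
        *count += 1;
        if *count > 1 {
            found_common = true;
        }
    }
    found_common
}

/// Runs the second, rewriting pass only when the first pass found something.
/// Returns the (possibly rewritten) expressions and whether a rewrite happened.
fn eliminate(exprs: Vec<String>) -> (Vec<String>, bool) {
    let rewritten = {
        let mut stats = HashMap::new();
        if find_common(&exprs, &mut stats) {
            // Must clone: `stats` borrows from `exprs`, so the originals
            // have to stay intact while we build the rewritten list.
            let out: Vec<String> = exprs
                .clone()
                .into_iter()
                .map(|e| if stats[e.as_str()] > 1 { format!("__common({e})") } else { e })
                .collect();
            Some(out)
        } else {
            None
        }
    };
    match rewritten {
        Some(out) => (out, true),
        // Nothing shared: hand back the originals without ever cloning them.
        None => (exprs, false),
    }
}
```

When no expression repeats, the input vector is returned as-is, so plans without common subexpressions pay for one borrowing traversal and nothing else.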
- window_exprs.clone(),
+ // Must clone as Identifiers use references to original expressions
+ // so we have to keep the original expressions intact.
+ window_exprs.clone(),
In terms of the cost of the extra clone, I think the performance results speak for themselves.
if group_found_common || aggr_found_common {
    // rewrite both group exprs and aggr_expr
    let rewritten = self.rewrite_expr(
        vec![group_expr.clone(), aggr_expr.clone()],
Extra clone, but the reason is the same as https://github.com/apache/datafusion/pull/10473/files#r1648985272
Given this only happens when we are actually doing a CSE rewrite (and not on all plans) I think it is fine
if aggr_found_common {
    let mut common_exprs = CommonExprs::new();
    let mut rewritten_exprs = self.rewrite_exprs_list(
        vec![new_aggr_expr.clone()],
Extra clone, but the reason is the same as https://github.com/apache/datafusion/pull/10473/files#r1648985272
maybe we can add a comment
Added in ee61224.
if found_common {
    let rewritten = self.rewrite_expr(
        vec![expr.clone()],
Extra clone, but the reason is the same as https://github.com/apache/datafusion/pull/10473/files#r1648985272
self.id_array[down_index].0 = self.up_index;
if !self.expr_mask.ignores(expr) {
    self.id_array[down_index].1.clone_from(&expr_id);
    let count = self.expr_stats.entry(expr_id.clone()).or_insert(0);
We no longer need to clone `Identifier`s as they are `Copy`.
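As a rough sketch of why a hash-based identifier can be `Copy`, consider the following. This is an assumption-laden illustration, not the PR's `Identifier`: the `Id` struct, its fields, and `count_occurrences` are hypothetical, and the expression is stood in for by a `&str`. The point is that a precomputed hash plus a shared reference is a small plain value, so it can be used as a map key and reused afterwards without `.clone()`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical identifier: a precomputed hash plus a borrowed expression.
/// Both fields are `Copy`, so the whole struct can derive `Copy`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Id<'n> {
    hash: u64,
    expr: &'n str,
}

impl<'n> Id<'n> {
    fn new(expr: &'n str) -> Self {
        let mut hasher = DefaultHasher::new();
        expr.hash(&mut hasher);
        Id { hash: hasher.finish(), expr }
    }
}

/// Counts how often each expression occurs, keyed by its identifier.
fn count_occurrences<'n>(exprs: &'n [String]) -> HashMap<Id<'n>, usize> {
    let mut stats = HashMap::new();
    for expr in exprs {
        let id = Id::new(expr);
        // `id` is copied into the map; no explicit clone is needed.
        *stats.entry(id).or_insert(0) += 1;
        // `id` is still usable here because `Id` is `Copy`.
        let _still_usable = id;
    }
    stats
}
```

With a `String`-based identifier, the `entry(...)` call would have needed `expr_id.clone()` as in the diff above; with a `Copy` type the same call site compiles without it.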
This PR is more or less ready for review; tests are passing except for the MSRV check. Local benchmarks show good performance improvements, but @alamb please confirm it with your standard setup.
I am starting the benchmark run now -- I'll report back here and give this PR a look as soon as possible (but maybe not until tomorrow). Thank you so much @peter-toth
My results appear consistent with yours @peter-toth -- looks like maybe the high column count case is slightly worse -- I can maybe profile it to see if I can find any reason that might be
I've bumped MSRV to 1.76 in ccc92b9. If we want to avoid that then let me know and I will try to find out what is not available in 1.75.
I ran the benchmarks again: Looks to me like this PR makes planning 10% faster 🚀
Wow @peter-toth - this is pretty amazing. Both in terms of code as well as ability to stick to it and keep it moving through many different iterations.
I had some minor comment suggestions, but really nothing of substance to add. I think once this PR's conflicts are resolved we can merge it.
cc @waynexia
@@ -26,7 +26,7 @@ homepage = { workspace = true }
 repository = { workspace = true }
 license = { workspace = true }
 authors = { workspace = true }
-rust-version = "1.75"
+rust-version = "1.76"
By my reading of the MSRV policy (lines 100 to 103 in d8bcff5):
DataFusion's Minimum Required Stable Rust Version (MSRV) policy is to support
each stable Rust version for 6 months after it is
[released](https://github.com/rust-lang/rust/blob/master/RELEASES.md). This
generally translates to support for the most recent 3 to 4 stable Rust versions.
1.75 was released 2023-12-28, meaning we need to keep the MSRV at 1.75 until 2024-06-28 (6 days from now).
However, since we aren't going to make a release until around July 11 (#11077), this is probably ok.
    .map(LogicalPlan::Projection)
    .map(Transformed::yes)
} else {
    // TODO: How exactly can the name or the schema change in this case?
Is this something you plan to do in this PR? Or is it for follow-up work? (I can file a new ticket if it is follow-on work.)
Yes. I didn't want to remove this part in this PR, but I think we can get rid of it in a follow-up one. If you file a ticket then please ping me or assign it to me. I will be offline from tomorrow for about a week, but when I'm back I'm happy to fix this.
@@ -524,41 +667,24 @@ impl CommonSubexprEliminate {
/// ```
///
/// where, it is referred once by each `WindowAggr` (total of 2) in the plan.
struct ConsecutiveWindowExprs {
That makes sense -- I put it all in a struct initially to try and encapsulate the logic -- I think your changes look good to me
@@ -507,7 +642,7 @@ impl CommonSubexprEliminate {
 /// ```
 ///
 /// Returns:
-/// * `window_exprs`: `[a, b, c, d]`
+/// * `window_exprs`: `[[a, b, c], [d]]`
👍
# Conflicts: # datafusion/optimizer/src/common_subexpr_eliminate.rs
🚀 thanks again @peter-toth
…anning) (apache#10473)
* implement hash based CSE identifier
* move `is_volatile()` check out of visitor
* fix comments
* add transformed asserts when `rewrite_expr` is called
* update MSRV to 1.76
* cleanup
* better comments regarding volatile and short circuiting expressions
* address review comments
Which issue does this PR close?
Closes #10426.
Rationale for this change
Now that #10832, #10939 and #10835 have landed, this PR adds 3 optimizations to CSE:
- Currently, `CommonSubexprEliminate` uses string identifiers to encode expression trees (e.g. the expression `col("a") + 1` is encoded as `"{a + Int32(1)|{Int32(1)}|{a}}"`), which can cause performance problems because identifiers are copied and concatenated many times during CSE.
  This PR changes the implementation of identifiers to:
  This struct contains:
- Moves the `is_volatile()` check of expressions out of the first, visiting traversal. As the `is_volatile()` check is an expression tree check implemented with `Expr::exists()`, checking the whole expression tree only once is enough and more efficient than checking all of its subtrees again and again during a traversal.
- Modifies `expr_to_identifier()` and `to_arrays()` to return a boolean flag that indicates whether executing the second, rewriting traversal is needed or whether we can skip it.
What changes are included in this PR?
- `Identifier` implementation.
- `Expr::hash_node()` method to build the hashcode of a node's direct content (without the children) to be able to calculate the hash of an expression tree efficiently.
- `expr_to_identifier()` and `to_arrays()` methods to return a boolean flag besides the `IdArray`.
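The idea behind a per-node hash that excludes children can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's implementation: the `Node` enum, the `subtree_hashes` function and the `a_plus_one` helper are hypothetical, and only the method name `hash_node` is taken from the description. Each node hashes its own content once and mixes in the already computed hashes of its children, so every subtree hash is obtained in a single bottom-up pass.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical toy expression type (not DataFusion's `Expr`).
enum Node {
    Col(String),
    Lit(i64),
    Add(Box<Node>, Box<Node>),
}

impl Node {
    /// Hashes only this node's own content (variant and payload), not its children.
    fn hash_node<H: Hasher>(&self, state: &mut H) {
        std::mem::discriminant(self).hash(state);
        match self {
            Node::Col(name) => name.hash(state),
            Node::Lit(value) => value.hash(state),
            Node::Add(_, _) => {}
        }
    }

    /// Returns this subtree's hash and appends the hash of every subtree to
    /// `out` in post-order. Each node is hashed exactly once.
    fn subtree_hashes(&self, out: &mut Vec<u64>) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash_node(&mut hasher);
        if let Node::Add(left, right) = self {
            left.subtree_hashes(out).hash(&mut hasher);
            right.subtree_hashes(out).hash(&mut hasher);
        }
        let hash = hasher.finish();
        out.push(hash);
        hash
    }
}

/// Builds the toy equivalent of `col("a") + 1`.
fn a_plus_one() -> Node {
    Node::Add(Box::new(Node::Col("a".to_string())), Box::new(Node::Lit(1)))
}
```

For a tree such as `(a + 1) + (a + 1)`, the two `a + 1` subtrees end up with equal hashes without either being stringified, and the work per node is constant apart from its own payload.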
.Are these changes tested?
Yes, with existing UTs.
Are there any user-facing changes?
No.